Abstract. In order to study the status and the possible evolution of clusters of galaxies at intermediate redshifts 
(z ~ 0.1 - 0.3), as well as their spatial correlation and relationship with the local environment, we built a sample 
of candidate groups and clusters of galaxies using radiogalaxies as tracers of dense environments. This technique 
- complementary to purely optical or X-ray cluster selection methods - represents an interesting tool for the 
selection of clusters in a wide range of richness, making it possible to study the global properties of groups 
and clusters of galaxies, such as their morphological content, dynamical status and number density, as well as 
the effect of the environment on the radio emission phenomena. In this paper we describe the compilation of a 
catalogue of ~ 16 000 radio sources in the region of the South Galactic Pole extracted from the publicly available 
NRAO VLA Sky Survey maps, and the optical identification procedure with galaxies brighter than b_J = 20.0 in 
the EDSGC Catalogue. The radiogalaxy sample, valuable for the study of radio source populations down to low 
flux levels, consists of 1288 identifications and has been used to detect candidate groups and clusters associated 
with NVSS radio sources. In a companion paper we will discuss the cluster detection method, the cluster sample as 
well as first spectroscopic results. 

1. Introduction 

One of the major topics in modern cosmology concerns the 
dynamical status and evolution of groups and clusters of 
galaxies, as well as their abundance and spatial distribu- 
tion, their morphological content and interactions with the 
environment. Groups and clusters of galaxies are indeed 
the largest, gravitationally bound, observable structures, 
and by studying their properties and the processes under- 
lying their formation much can be understood about the 
global cosmological properties of the universe. 

In recent years, significant efforts have been made in 
searching for clusters at high redshifts; nevertheless the 
general properties and the physical processes at work in 
these large scale structures at moderate z are still un- 
clear. To this aim, it is of fundamental importance to 
gather cluster samples representative of different dynam- 
ical structures - from groups to rich clusters - in a wide 
range of redshift and covering large areas of the sky. 

First attempts to build wide-area cluster samples, like 
the ACO/Abell catalogue (Abell et al. 1989), were based 
on visual inspection of optical plates, and only recently have 
the first catalogues obtained through objective algorithms 
appeared (EDCC, Lumsden et al. 1992; APM, Dalton et al. 
1994). The selection based on optical plates, however, 
limits the redshift range to about z < 0.2 and suffers from 
misclassifications due to projection effects along the line 
of sight, resulting on one side in spurious cluster 
detections and, on the other side, in wrong estimates of the 
cluster richness, which can affect the reliability of the 
derived cosmological parameters (van Haarlem et al. 1997). 
Alternatively, cluster samples at higher redshift have been 
built using a matched filter algorithm which makes use 
of both positional and deep multiband CCD photometric 
data over selected areas of a few square degrees (Postman 
et al. 1996; Scodeggio et al. 1999). 

Also, finding candidate clusters at intermediate redshift 
through color diagrams alone could bias the selection 
against clusters with a high fraction of blue galaxies, 
whose presence can be due to the occurrence of the 
Butcher-Oemler effect (Butcher & Oemler 1984) or to the 
fact that the cluster itself can be in the process of 
formation. 

The X-ray emission properties of the hot intracluster 
medium have been widely used to build distant cluster 
samples, but this technique suffers from the limited 
sensitivity of wide-area X-ray surveys and from the possibility 
of evolutionary effects (Gioia et al. 1990; Henry et al. 1992; 
RDCS, Rosati et al. 1998). 



Even more critical is the selection of groups of galaxies: 
these structures - which represent a sort of "bridge" be- 
tween rich clusters and the field - are of major interest for 
the understanding of galaxy interactions and evolution- 
ary processes, but their detection is particularly difficult 
even at moderate redshifts due to their very low density 
contrast with respect to the distribution of field galaxies. 

A different approach - complementary to purely optical 
or X-ray cluster selection methods - is the use of 
radiogalaxies as suitable tracers of dense environments. 
Recent studies (Prestage & Peacock 1988; Hill & Lilly 
1991; Allington-Smith et al. 1993; Zirbel 1997; Miller et 
al. 1999) have shown that Fanaroff-Riley I and II 
radio sources are found in different environments, and also 
differ in the optical properties of their host galaxies. 
FRI sources are found on average in rich groups or clusters 
at any redshift, and are associated with elliptical galaxies, 
with the most powerful FRIs often hosted by a cD or 
double nucleus galaxy (Zirbel 1996). FRII radio sources are 
typically associated with disturbed ellipticals and avoid 
cD galaxies, and at z ~ 0.5 FRIIs are found in a wide 
range of environments, including many rich clusters which 
rarely, if ever, host an FRII radio source at low redshift 
(Zirbel 1996; Hill & Lilly 1991). 

Radio selection should not affect the X-ray or optical 
properties of the clusters found in this way, since there 
is no significant correlation between the radio properties of 
galaxies within a cluster and its L_X (Feigelson et al. 1982; 
Burns et al. 1994), or the richness of the cluster (Zhao et 
al. 1989; Ledlow & Owen 1996). Moreover, since no 
correlation exists between the properties of group members 
and the radio characteristics of the radiogalaxies, 
radio-selected groups can be used to study the general evolution 
of galaxies in groups (Zirbel 1997). 

Radiogalaxies can thus be used as tracers of dense envi- 
ronments at any epoch, and the evolution of galaxy groups 
and clusters can be studied lessening those biases that are 
the main drawbacks of pure optical or X-ray selected clus- 
ter samples. 

A further point that makes this selection technique 
interesting is the possibility to investigate the effects of 
the environment on the radio-emission phenomena. Zirbel 
(1997) proposes two distinct scenarios 
for the fueling of radio emission in FRI and FRII 
sources. The difference in the environments of FRII radio 
sources at low and high redshift suggests that the condi- 
tions to form such sources have changed with epoch, and 
the characteristics of their optical counterparts are con- 
sistent with the hypothesis of FRII radio emission being 
fueled by galaxy encounters. FRI radio sources, instead, 
may be drawn from different 
galaxy types, and be triggered by different 
mechanisms, depending on their power. The most powerful FRI 
sources are typically dominant galaxies and their environ- 
ments seem to be consistent with the possibility of them 
being cooling flow galaxies: in this scenario, the cooling 
flow itself can provide the fuel for the radio source. The 
less powerful FRI sources do not always correspond to the 
first ranked galaxy, are not always found in the centre of 
the potential well, and some reveal signs of galaxy 
interactions (see e.g. Baum et al. 1988). It thus seems unlikely 
that the less powerful FRIs can be cooling flow galaxies, 
and the radio emission could be triggered by a different 
mechanism with respect to more powerful FRIs. 

This scenario suggests that the radio source morphology 
is not only a function of the radio power, as suggested 
by theoretical models (Bridle & Perley 1984; Bicknell 
1984, 1986), but also depends on the epoch of observation, 
that is, on the density and evolution of the intracluster 
medium. In this sense, the study of radio-selected groups 
and clusters over a wide range in radio power may help 
in understanding the physics of radio emission and the 
relationships between different classes of AGN. 

To build such a sample of radio-traced clusters, the 
new radio surveys NRAO VLA Sky Survey (NVSS, 
Condon et al. 1998) and Faint Images of the Radio Sky 
at Twenty-centimeters (FIRST, Becker et al. 1995) offer 
an unprecedented possibility to study a wide-area, 
homogeneous sample of radio sources down to very low flux 
levels, together with a positional accuracy suitable for 
optical identifications. 

Recently, Blanton et al. (2000) looked for moderate 
to high redshift clusters associated with a sample of ra- 
dio sources from the FIRST survey, having a bent-double 
radio morphology. The presence of a distorted radio struc- 
ture may be the consequence of the relative motion of the 
host galaxy in the intracluster medium, or of tidal interac- 
tions with other cluster galaxies, and thus can be used as 
an indicator of the presence of a cluster or group 
surrounding the radio source. From R-band imaging of the fields 
surrounding bent-double radio sources, Blanton et al. (2000) 
selected ten candidates for multislit spectroscopy, and for 
eight of them they found evidence of a cluster associated 
with the radiogalaxy, with measured richnesses ranging from 
Abell class 0 to 2. As FRI sources more frequently show 
a distorted morphology, this sample contains mostly FRI 
radiogalaxies. Moreover, due to its high resolution, the 
FIRST survey may resolve out extended sources, making 
the FRI/FRII classification difficult. 

The lower angular resolution of the NVSS 
makes this survey more suitable than FIRST for the 
detection of extended regions of low surface brightness. 
We used the publicly available NVSS radio maps 
to build a sample of radio-optically selected clusters 
associated with FRI and FRII radio sources over a wide area 
of the sky. In this paper we describe the compilation of a 
radio source catalogue and the optical identification 
procedure with galaxies in the EDSGC Catalogue (Nichol et al.) 
that led to the compilation of the radiogalaxy sample. 
In a companion paper (Zanichelli et al. 2001, Paper 
II) we will present the cluster selection method and the 
sample of candidate clusters we have obtained, as well as 
first spectroscopic results. 

This paper is structured as follows: in Sect. 2 we give 
a description of the characteristics of NVSS radio data 
and discuss the need to compile a radio source catalogue 
as an alternative to the publicly available NVSS one. The 
extraction of the radio source catalogue is presented in 
Sect. 3, together with a discussion of the classification 
of double radio sources. The radio source catalogue and 
its properties are discussed in Sects. 4 and 5. In Sect. 6 
and following, the optical identification procedure and the 
resulting radiogalaxy sample are described. 



2. The radio data 

In this work we make use of data from the NRAO VLA 
Sky Survey (Condon et al. 1998). The NVSS started in 
1993 with the VLA in D and DnC configurations and has 
recently been completed. The NVSS covers the whole sky 
north of δ = -40° at a frequency of 1.4 GHz with a 
resolution of 45". 

Data products consist of 2326 maps, each 4° × 4°, in 
Stokes I, Q, and U with pixel size 15" and rms brightness 
fluctuations of 0.45 mJy beam^-1. The positional rms 
in Right Ascension and Declination varies from < 1" for 
relatively strong (S > 15 mJy) point sources to 7" for the 
faintest (S = 2.3 mJy) detectable sources. 

The positional accuracy, together with the low flux 
limit and moderate resolution of the survey, makes the 
NVSS particularly suitable for the detection of low-surface 
brightness extended structures and for the search for 
optical counterparts of radio sources. 

A list of about 2 × 10^6 discrete sources is available as 
well, and has been extracted from the survey images by 
fitting elliptical Gaussians to all significant peaks (Condon 
et al. 1998). In the compilation of this list, hereafter the 
NVSS-NRAO catalogue, no attempt is made to classify 
sources according to their morphology (double or 
pointlike sources). 

Nevertheless, when one wants to make optical identi- 
fications, a crucial point is the knowledge of the source 
structure. If a double source, for which we can expect to 
find the optical counterpart near the radio barycentre po- 
sition, is erroneously treated as two single components, 
the identification procedure can lead to misleading results, 
thus seriously affecting the completeness and reliability of 
the identification program. 

The blind use of a list of fitted components like the 
NVSS-NRAO catalogue is thus not optimal if one wants 
to get a radiogalaxy sample characterized by well defined 
statistical properties, suitable for further astrophysical ap- 
plications. For this reason, we developed our own algo- 
rithm for the extraction of a radio source catalogue from 
the radio maps and for the morphological classification 
of the detected sources, as will be discussed in the next 
Sections. 



3. The radio source extraction algorithm 

The operations performed by the extraction algorithm 
are divided into five modules: the source detection, the 
1-Gaussian fit module, the evaluation of fit reliability, the 
2-Gaussian fit module and the classification of double 
sources. More details on the operations performed by the 
algorithm are given in Appendix A. 

3.1. Source detection 

The algorithm reads each NVSS FITS map, consisting 
of a 1024 × 1024 pixel matrix (1 pixel = 15"), and then 
looks for emission peaks: we adopted a threshold flux of 
S_p = 2.5 mJy beam^-1, corresponding to the 5σ level 
for the NVSS survey (rms noise on NVSS I images is 
≈ 0.45 mJy beam^-1, Condon et al. 1998). A different 
detection threshold has been applied to two sky regions 
where strong residual diffraction lobes due to the presence 
of a very bright (~ 2.5 Jy) and extended source are found. 
To avoid detecting a large number of spurious sources, for 
these regions we evaluated the local noise and selected 
only those peaks with S_p > 5σ_local. 

A submatrix of 15 × 15 pixels (≈ 3.8' × 3.8') around each 
peak is built, defining the region over which the operations 
described in the next sections are executed. 
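
The detection step described above can be illustrated with a short Python sketch. It is not the extraction code used for the catalogue: the definition of a peak as a local maximum of its 3 × 3 neighbourhood, the function names and the handling of map edges are assumptions made only for illustration; the threshold and the submatrix size are the values quoted in the text.

```python
import numpy as np

THRESHOLD_MJY = 2.5   # peak-flux threshold in mJy/beam, as quoted in the text
HALF_BOX = 7          # 15 x 15 pixel submatrix: 7 pixels on each side of the peak

def find_peaks(image, threshold=THRESHOLD_MJY):
    """Return (row, col) of pixels above threshold that are maxima of their
    3x3 neighbourhood (assumed peak definition, not taken from the paper)."""
    peaks = []
    rows, cols = image.shape
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = image[r, c]
            if value < threshold:
                continue
            if value >= image[r - 1:r + 2, c - 1:c + 2].max():
                peaks.append((r, c))
    return peaks

def cut_submatrix(image, peak, half=HALF_BOX):
    """Cut the (2*half+1)^2 box centred on a peak; None if it crosses the edge."""
    r, c = peak
    if r < half or c < half or r + half >= image.shape[0] or c + half >= image.shape[1]:
        return None
    return image[r - half:r + half + 1, c - half:c + half + 1]
```

With this peak definition a flat-topped source yields several adjacent peaks, which mirrors the behaviour reported for "extended" sources.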



3.2. Fit with one Gaussian component 

A fit with a circular Gaussian function of fixed FWHM = 
45" is performed over each submatrix (see Appendix A); 
the FWHM of the fitting function has been chosen to 
reproduce the nominal beam of the NVSS. 

The use of a fixed FWHM has the consequence that 
it is not possible to determine the angular dimension 
- and thus the integrated flux - of the radio sources. 
Nevertheless, some tests showed that the use of a Gaussian 
of variable size is not advantageous when fitting sources 
at low flux levels (< 8σ), whose resulting positions and 
fluxes were found to be inaccurate. We decided thus to 
fix the dimensions of the fit function and to perform a fit 
with two Gaussian components in those cases when the 
one-component fit was not satisfactory. In Sect. 3.3 the 
criteria for performing a 2-component fit are described. 

Input parameters for the 1-component fit are the x and 
y peak pixel coordinates of the submatrix central point, 
and the measured flux at that pixel. For each source, the 
algorithm computes the fit rms Σ_1fit (see Appendix A), 
which is used as a discriminant for the execution of the 
2-Gaussian fit. 
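
A fixed-FWHM fit of this kind can be sketched as follows. The minimization actually used is the one described in the Appendix; here it is replaced, purely as an assumption, by a grid search over the centre with the peak flux solved by linear least squares. The rms is taken as the root mean square of the residuals over the submatrix, which is also an assumed definition of Σ_1fit, and all function names are hypothetical.

```python
import numpy as np

FWHM_PIX = 45.0 / 15.0   # 45" beam sampled with 15" pixels
SIGMA_PIX = FWHM_PIX / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def beam(shape, x0, y0, sigma=SIGMA_PIX):
    """Unit-peak circular Gaussian centred on (x0, y0), in pixel units."""
    y, x = np.indices(shape)
    return np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))

def fit_one_gaussian(sub, step=0.1, reach=1.5):
    """Grid search over the centre; the peak flux follows from linear least
    squares. Returns (x0, y0, peak, rms). Stand-in for the real minimizer."""
    cy = (sub.shape[0] - 1) / 2.0
    cx = (sub.shape[1] - 1) / 2.0
    offsets = np.arange(-reach, reach + step / 2.0, step)
    best = None
    for dy in offsets:
        for dx in offsets:
            model = beam(sub.shape, cx + dx, cy + dy)
            peak = float((model * sub).sum() / (model * model).sum())
            rms = float(np.sqrt(np.mean((sub - peak * model) ** 2)))
            if best is None or rms < best[3]:
                best = (cx + dx, cy + dy, peak, rms)
    return best
```

Because the width is fixed, only the position and the peak flux are free, which is why the integrated flux of a resolved source cannot be recovered from this fit alone.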



3.3. Fit reliability 

Inspecting the results obtained from the 1-component fit 
for some test sources, we found that they are not 
satisfactory in terms of positional and photometric accuracy 
when the fit rms Σ_1fit > 0.6 mJy pixel^-1. The 
distribution of Σ_1fit values in different flux bins showed that the 
percentage of sources with S_p < 5.0 mJy beam^-1 for 
which Σ_1fit > 0.60 mJy pixel^-1 is reasonably low, of the 
order of 10%. We thus decided to apply a 2-component fit 
only to those sources with both Σ_1fit > 0.60 mJy pixel^-1 
and S_p > 5.0 mJy beam^-1. If Σ_2fit > Σ_1fit 
(≈ 4% of these sources), then the 1-component solution is 
restored. 
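
The selection rule stated in this paragraph reduces to two comparisons. The sketch below only restates it in Python; the function names are hypothetical, and the special handling of "extended" and "multiple" sources through control flags is deliberately left out.

```python
RMS_LIMIT = 0.60    # mJy/pixel, limit on the 1-component fit rms
FLUX_LIMIT = 5.0    # mJy/beam, limit on the peak flux

def needs_two_component_fit(rms_1fit, peak_flux):
    """True when both the fit rms and the peak flux exceed their limits."""
    return rms_1fit > RMS_LIMIT and peak_flux > FLUX_LIMIT

def choose_solution(rms_1fit, rms_2fit):
    """Restore the 1-component solution when the 2-component rms is worse."""
    return "1-component" if rms_2fit > rms_1fit else "2-component"
```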

There are however two categories of sources for which 
the above criterion did not guarantee good results with 
the 1-component fit, and required a different approach: 
these cases are what we called "extended" and "multiple" 
sources. 

In the presence of "extended" sources, whose flux 
distribution presents a "plateau" instead of a well defined 
maximum, the algorithm can detect more than one 
emission peak, and attempts to perform as many fits: this 
happens for about 6% of the sources fitted with 1 
Gaussian component, with no dependence on their flux. 
It has been possible to identify two different situations: 
if the distance between the fitted positions is less than 
4", then Σ_1fit < 0.60 mJy pixel^-1 in all cases. For distances 
between 4" and 45", on the contrary, at least one fit has 
Σ_1fit > 0.60 mJy pixel^-1. In the first case we verified 
that 1 Gaussian with fixed FWHM reproduces the source 
correctly: the 1-component fit is considered valid, and 
the radio source is assigned the position of the barycentre 
of the multiple fits. In the second case, the effect of the 
source extension is not negligible and the highest values 
of Σ_1fit and S_p among those fitted are attributed to the 
source, which is thus forced to the 2-component fit. 

A further class of sources, the "multiple" ones, has 
been identified during the implementation of the 
2-component fit module: the dimension of the fit submatrix 
is such that the number of times it contains two sources is 
not negligible. We found that the presence of more than 
one source in the same fit submatrix seriously affects the 
minimization process and the accuracy of fitted parame- 
ters. 

We took these situations into account by 
introducing the following criterion: all those sources having a 
neighbour inside 2.5', with at least one of them having 
Σ_1fit > 0.6 mJy pixel^-1, are considered "double". In such 
a case, a new fit submatrix is defined around the central 
position between the two components and the source is 
forced to the 2-component fit. A distance of 2.5' 
guarantees that the structure of both components is well 
represented in the region defined by the fit submatrix. 
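
The neighbour criterion can be sketched as a pair search. The flat-sky separation formula, the tuple layout of a source and the function names below are assumptions for illustration; right ascension wrap-around at 0h is ignored, and the midpoint is returned as the centre of the new fit submatrix.

```python
import math

MAX_SEP_ARCMIN = 2.5   # maximum separation quoted in the text
RMS_LIMIT = 0.6        # mJy/pixel, limit on the 1-component fit rms

def separation_arcmin(a, b):
    """Flat-sky separation between two (ra_deg, dec_deg, ...) entries."""
    mean_dec = math.radians(0.5 * (a[1] + b[1]))
    dra = (a[0] - b[0]) * math.cos(mean_dec)
    ddec = a[1] - b[1]
    return 60.0 * math.hypot(dra, ddec)

def candidate_doubles(sources):
    """sources: list of (ra_deg, dec_deg, rms_1fit).
    Returns (i, j, (ra_mid, dec_mid)) for pairs closer than 2.5' in which
    at least one member has a poor 1-component fit."""
    pairs = []
    for i in range(len(sources)):
        for j in range(i + 1, len(sources)):
            a, b = sources[i], sources[j]
            if separation_arcmin(a, b) > MAX_SEP_ARCMIN:
                continue
            if max(a[2], b[2]) > RMS_LIMIT:
                mid = (0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]))
                pairs.append((i, j, mid))
    return pairs
```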

To keep track of the different operations and adopted 
criteria, multiple and extended sources have been marked 
with control flags. A further analysis of the classification 
of double sources has been made as the final step in the 
construction of the radio source catalogue (see Sect. 3.5). 

3.4. Fit with two Gaussian components 

The 2-component fit models sources with two Gaussian 
functions, each having FWHM = 45". Input parameters 
needed to describe the two functions are: x and y peak 
pixel coordinates of the submatrix central point, peak flux, 
distance in x and y between the two components (in pixels 
from the barycentre), logarithmic ratio of fluxes of the two 
components. Obviously, this amounts to assuming that the 
brightness distribution of the source is modelled as the 
sum of two pointlike sources. 

Fig. 1. Distribution of the separation between components, 
D (arcsec), for 660 double catalogue radio sources (solid 
line) and 409 random doubles (dashed line), belonging to 
the 6 maps we examined (see text). 
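
One possible reading of this parametrization is sketched below. The text does not spell out how the six parameters map onto the two components, so the following choices are assumptions: the peak flux is treated as the sum of the two component peaks, the logarithmic ratio is log10(S1/S2), (dx, dy) is the full separation vector, and the flux-weighted barycentre is held at (xb, yb). The function name is hypothetical.

```python
import numpy as np

SIGMA_PIX = (45.0 / 15.0) / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def two_component_model(shape, xb, yb, total_peak, dx, dy, log_ratio):
    """Sum of two fixed-width Gaussians (assumed parametrization).
    Returns the model image and the (flux, x, y) of each component."""
    q = 10.0 ** log_ratio                 # S1 / S2
    s1 = total_peak * q / (1.0 + q)
    s2 = total_peak / (1.0 + q)
    # Offsets chosen so that the flux-weighted mean position is (xb, yb)
    x1, y1 = xb + dx * s2 / total_peak, yb + dy * s2 / total_peak
    x2, y2 = xb - dx * s1 / total_peak, yb - dy * s1 / total_peak
    y, x = np.indices(shape)

    def gauss(x0, y0):
        return np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * SIGMA_PIX ** 2))

    model = s1 * gauss(x1, y1) + s2 * gauss(x2, y2)
    return model, (s1, x1, y1), (s2, x2, y2)
```

Keeping the barycentre as an explicit parameter is convenient here because it is the position at which the optical counterpart of a true double is expected.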

Even if a flux ratio S1/S2 > 4 is seldom found for true 
double radio sources, we allowed this parameter to be 
as high as 10, with a lower limit for the flux of a single 
component S_p = 1.5 mJy beam^-1. This choice proved 
to be useful to correctly fit the flux of those "extended" 
sources discussed in Sect. 3.3. In fact, due to our choice 
of fixed-size Gaussian functions, when dealing with 
"extended" sources for which a second peak is not detected, 
the extraction algorithm may need a second component 
to correctly fit the source flux. 

For each double source the algorithm evaluates total 
flux and barycentre position, as well as flux, coordinates 
and separation of the two components. The fit rms Σ_2fit is 
computed similarly to what is done for the 1-component 
fit; if Σ_2fit > Σ_1fit, and depending on the source control 
flags (if there are any), the 1-component fit may be 
considered valid. 

3.5. Classification of double radio sources 

The distinction made by the algorithm between single and 
double sources is strongly influenced by the sky 
distribution of radio sources and by the characteristics of the fit 
procedure, so that a certain number of spurious 
associations, classified as double on the basis of a positional 
coincidence of single, non-interacting components, is expected. 
A further step in the compilation of the radio source 
catalogue is thus the estimate of the fraction of double radio 
sources that could have been so classified on the basis of 
the chance coincidence of two unrelated sources. 

Fig. 2. Histogram of fluxes for pointlike (a) and double (b) radio sources in the catalogue. There are 4 pointlike 
sources with Log S_p > 3.0 and 1 double source with Log S_p > 3.4 not shown in these plots. 

Given the observed surface density of NVSS radio 
sources, and under the hypothesis that all of them are 
single sources, we looked at the probability that a source has 
a neighbour by chance, assuming a completely random sky 
distribution. We considered regions of 4° × 4° belonging to 
the 6 NVSS maps we analyzed. For each region we generated 
5 random samples, each containing as many positions as 
the detected NVSS sources (i.e. 1 × n_single + 2 × n_double), 
associating with them flux values randomly chosen 
among the measured ones. We then looked for pairs in 
the random samples, i.e. sources having a neighbour 
inside 2.5', which is the maximum distance we allowed for the 
classification of a double source. 
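
The random-sample test can be reproduced in outline with a few lines of Python. This is a simplified sketch under stated assumptions: positions are drawn uniformly on a flat 4° × 4° square, the random assignment of fluxes is omitted, and the function names and the seed are hypothetical. The second function gives the Poisson expectation for comparison, neglecting edge effects.

```python
import numpy as np

def count_random_pairs(n_positions, side_deg=4.0, max_sep_arcmin=2.5,
                       n_samples=5, seed=0):
    """Mean number of pairs closer than max_sep_arcmin over n_samples
    uniform random samples in a flat square of side side_deg."""
    rng = np.random.default_rng(seed)
    max_sep_deg = max_sep_arcmin / 60.0
    counts = []
    for _ in range(n_samples):
        xy = rng.uniform(0.0, side_deg, size=(n_positions, 2))
        dist = np.hypot(xy[:, None, 0] - xy[None, :, 0],
                        xy[:, None, 1] - xy[None, :, 1])
        # Upper triangle only, so each pair is counted once
        counts.append(int(np.triu(dist <= max_sep_deg, k=1).sum()))
    return float(np.mean(counts))

def expected_pairs(n_positions, side_deg=4.0, max_sep_arcmin=2.5):
    """Poisson expectation N(N-1)/2 * pi r^2 / A, ignoring edge effects."""
    r = max_sep_arcmin / 60.0
    return 0.5 * n_positions * (n_positions - 1) * np.pi * r ** 2 / side_deg ** 2
```

Binning the separations of the random pairs in the same way as those of the catalogue doubles would give the contamination as a function of D.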

We detected an average of 409 spurious doubles over 
the 6 maps we took into consideration for this analysis. 
Over the same sky region, there are 660 double radio 
sources in the catalogue. In Fig. 1 the distributions of the 
distance between components for catalogue double sources 
and random double sources are shown. 

We distinguish three different contamination 
situations depending on the separation D between the radio 
components. When D < 50" the mean contamination is 
about 13% and we decided to keep the algorithm 
classification as valid: hereafter we will call these "close" double 
radio sources. For separations larger than 100" the 
probability of chance coincidence is such that we can consider 
all of them as spurious doubles: the 709 radio sources 
belonging to this interval have thus been included as two 
single components in the list of single sources. In the 
interval between 50" and 100" a decision can hardly be 
made: the contamination rate for these radio sources is 
high (≈ 61%) but the number of expected true doubles is 
not negligible. In order not to miss the corresponding 
optical identifications, we kept these sources (hereafter "wide" 
doubles) among the double ones, but for this group we 
followed a more careful procedure during the search for optical 
counterparts (see Sect. 6). 
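
The three separation regimes translate into a simple classification rule, sketched below with hypothetical function names and labels. The contamination helper is the ratio of random to catalogue doubles in a given separation interval; applied to the totals quoted above (409 and 660) it gives the overall value over all separations, not the per-interval figures.

```python
def classify_double(separation_arcsec):
    """Label a double by component separation D, following the text:
    D < 50" close, 50"-100" wide, D > 100" split into two singles."""
    if separation_arcsec < 50.0:
        return "close"
    if separation_arcsec <= 100.0:
        return "wide"
    return "split"

def contamination(n_random, n_catalogue):
    """Fraction of catalogue doubles expected to be chance pairs."""
    return n_random / n_catalogue
```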

Our estimate of the number of random doubles, de- 
rived under the assumption of a uniform sky distribution 
of radio sources, does not take into account any effect due 
to clustering properties of radio sources. However, there 
are indications that the clustering of radio sources on 
angular scales greater than the NVSS resolution is weak 
(Magliocchetti et al. 1998), and thus it should not 
significantly alter our results. 

Due to our choice of a maximum separation between 
radio source components of 100", our catalogue of double 
radio sources does not include the class of "giant" doubles. 
For a reliable detection of such sources, additional radio 
data with a better angular resolution than that provided 
by the NVSS survey alone would be needed, to allow the 
determination of the morphological type and of the com- 
pact core component necessary for the identification of the 
optical counterpart. Samples of giant radio sources have 
been selected on the basis of many different criteria on 
their angular size, radio power and optical magnitude (see 
e.g. Cotter et al. 1996; Machalski et al. 2001), so that it is 
not straightforward to give an estimate of the expected 
number of missed giants in our catalogue. 



4. The radio source catalogue 

The extraction algorithm has been applied to 31 NVSS 
maps in the region of the South Galactic Pole. The 
algorithm classified 3371 radio sources as double: among these, 
709 have a distance between components > 100" and have 
been included in the list of pointlike radio sources. 

The resultant catalogue consists of 13 340 single 
sources and 2662 double radio sources over ≈ 550 sq. 
degrees of sky. The distribution of peak fluxes is shown in 
Fig. 2. The fluxes do not include any correction for the 
"CLEAN bias" (-0.3 mJy beam^-1 for the NVSS): 
this has been taken into account when comparing our 
catalogue with the list of NVSS source components distributed 
by the NRAO (see Sect. 5.1). 

The catalogue is complete down to the NVSS flux 
limit of 2.5 mJy beam^-1 and, as can be seen in Fig. 3, 
positional uncertainties estimated by the algorithm are 
in good agreement with those expected for NVSS sources 
(Condon et al. 1998). 

The electronic version of the radio source catalogue is 
available at the Centre de Donnees de Strasbourg (CDS). 



5. Tests on the catalogue 

As the accuracy in positions and fluxes can influence the 
photometric completeness of the radio source catalogue as 
well as reliability and completeness of the optical identifi- 
cations, before looking for radio source counterparts some 
tests to verify the algorithm stability and reliability in 
computing positions and fluxes have been performed on 
the catalogue. 

5.1. Comparison with the NVSS-NRAO catalogue 

A first test to assess the accuracy of fluxes and positions 
computed by the radio source extraction algorithm 
described in the previous sections has been made by 
comparison with the NVSS-NRAO catalogue (Condon et al. 1998). 
The NVSS-NRAO catalogue does not provide any classi- 
fication in single or double radio sources, and simply gives 
a list of components fitted with an elliptical Gaussian of 
variable size: for this reason, we restricted this quantita- 
tive analysis only to the single radio sources in our cata- 
logue. A qualitative analysis of double radio sources has 
been made by visual inspection and described below. 

We extracted from our catalogue a set of 323 pointlike 
radio sources belonging to the central 3 × 3 square 
degrees of the NVSS map I0016M24, and compared their 
positions and fluxes with those found in the NVSS-NRAO 
catalogue. To take into account the dependence of 
positional accuracy on the source flux, this analysis has been 
made in the three flux intervals S_p < 4 mJy beam^-1, 
4 < S_p < 8 mJy beam^-1 and S_p > 8 mJy beam^-1. 
The absolute values of the mean differences in Right Ascension 
and Declination were found to vary from ≈ 0.5" to ~ 0.02" 
with a dependence on flux, with rms varying from ~ 2" 
for the lowest fluxes to ~ 0.3" for sources brighter than 
8 mJy beam^-1. 

Photometric accuracy has been tested by comparing 
peak fluxes in the NVSS-NRAO catalogue with those 
determined by our extraction algorithm: the latter are 
on average slightly underestimated, the median of the 
differences being ΔS_p ~ -0.3 mJy beam^-1, of the 
order of the "CLEAN bias" term for which the published 
NVSS-NRAO flux values have been corrected. 

In Fig. 4 the offset distribution in Right Ascension, 
similar to the one in Declination, and the difference in 
fluxes between our catalogue and the NVSS-NRAO one 
are shown for the 323 considered sources. 

During this analysis we found that in some cases a 
bright pointlike radio source is split in more than one 
component by the NVSS-NRAO extraction algorithm. 
This fact can be ascribed to the use of a totally auto- 
matic extraction procedure, needed when managing such 
a huge amount of data. Nevertheless, it points out that 
the "blind" use of a component catalogue like the NVSS- 
NRAO one for optical identifications of radio sources can 
introduce contamination and incompleteness effects in the 
sample of optical counterparts. 

As the NVSS-NRAO catalogue makes no attempt 
to classify double radio sources, a similar 
quantitative comparison has not been possible for double 
systems and we limited our analysis to the visual inspection 
of 68 cases of "close" and "wide" doubles in the catalogue. 
"Close" pairs are generally fitted with 1 component in the 
NVSS-NRAO catalogue and we found a good agreement 
between the NVSS-NRAO component position and the 
barycentre in the radio source catalogue. A different situ- 
ation exists for "wide" pairs in the radio source catalogue: 
their radio structure, as seen on the radio maps, is typ- 
ical of classical double radio sources. In such cases, for 
which the NVSS-NRAO catalogue lists only the positions 
of the two components, the optical counterpart is clearly 
to be searched near the radio barycentre position and thus 
would be missed if one makes a blind use of the NVSS- 
NRAO catalogue. 

5.2. Stability and reliability of the algorithm 

A further test has been made on the radio source cata- 
logue by using sources in the overlapping regions between 
adjacent maps to check the stability of the extraction al- 
gorithm in reproducing fluxes and positions. 

We compared catalogue data for 120 single 
sources (half of them with fluxes larger than 
10 mJy beam^-1) and 53 double sources with those 
obtained by fitting the same sources with the AIPS task 
JMFIT. We again found consistency with what is predicted 
for NVSS sources: errors on the positions of single sources 
vary from ≈ 3" at fluxes lower than 5 mJy beam^-1, to 
0.2" for S_p > 15 mJy beam^-1. For the double sources we 
find slightly larger values, ~ 2" for S_p > 15 mJy beam^-1. 
We can conclude that for both pointlike and double 
radio sources in our catalogue, the positional accuracy is 
good enough to allow optical identifications with galaxies 
brighter than b_J = 20.0 down to the NVSS flux limit. 

The difference between catalogue peak fluxes and peak fluxes 
obtained with JMFIT is < 1% for single sources, similar 
to the values found examining radio sources in the 
overlapping regions; for double sources the fractional variation 
exceeds 5% for fluxes larger than 15 mJy beam^-1. 
This higher value can be ascribed to a difficulty in 
representing the source with two Gaussians of fixed FWHM 
as the source flux increases. However, these results can be 
considered satisfactory as the uncertainties are not such 
as to compromise the reliability of the optical identifications. 

Fig. 3. Positional uncertainties estimated by the radio source extraction algorithm for single (a) and double (b) radio 
sources as a function of peak flux. The dot-dashed line represents the errors derived from the formulae in Condon et 
al. (1998). In (b) the dashed vertical line represents the flux limit for the 2-component fit (see Sect. 3.4). 

Fig. 4. (a) Distribution of differences in Right Ascension for 323 pointlike radio sources in our catalogue with respect to 
NVSS-NRAO catalogue positions. (b) NVSS-NRAO fluxes vs. radio source catalogue ones for the same set of sources. 
There are 4 off-plot sources brighter than 160 mJy beam^-1. 

6. The optical identification procedure 

The optical identification procedure has been applied separately to
the three classes of NVSS radio sources in our catalogue: pointlike,
"close" doubles and "wide" doubles.

"Wide" doubles are in fact affected by a non-negligible probability
of being erroneously classified as double systems by our extraction
algorithm; that is, we do not know a priori whether the optical
counterpart is to be expected near the radio barycentre, which we
assume to be the most likely position if the classification is
correct (Venturi et al. 1997; Prandoni et al. 2001), or near the
components. We have thus followed a careful approach in the
identification of these sources, as detailed in Sect. 7.

"Close" doubles have a low probability of being spurious
associations of single components, so that in principle they could
be treated like the pointlike sources during the identification
phase, by looking for counterparts near their barycentres.
Nevertheless, pointlike sources and "close" doubles have been kept
distinct in the identification phase, since we verified (see
Sect. 6.2) that for "close" doubles it is not possible to fulfil the
requirements of the Likelihood Ratio method (De Ruiter et al. 1977).
This method, applied in order to keep the contamination from
radio-optical chance coincidences sufficiently low in the
radiogalaxy sample, has been used only for the list of optical
counterparts of pointlike radio sources.

Fig. 5. Radio-optical offsets in Right Ascension (a) and Declination
(b) for 218 barycentres of NVSS radio sources in the catalogue
(empty areas) and 62 contaminants from random samples (shaded
regions), optically identified with galaxies inside a box of size
40".

In the next subsections we discuss the properties of the optical
data used for the compilation of the radiogalaxy sample and the
determination of radio-optical positional uncertainties, necessary
to define the optimal radius for the search of optical counterparts.
More details on the Likelihood Ratio method and on results from
simulated samples are given in Appendix B.

6.1. Optical data 

Optical identifications of radio sources in the NVSS catalogue have
been made with galaxies in the Edinburgh-Durham Southern Galaxy
Catalogue (EDSGC, Nichol et al. 2000). The EDSGC lists photographic
$b_J$ magnitudes for ≈ $1.5 \times 10^6$ galaxies over a contiguous
area of ~ 1200 sq. degrees at the South Galactic Pole. For the
construction of the catalogue, glass copies of 60 IIIa-J plates of
the ESO/SERC Sky Survey at galactic latitude $|b_{II}| > 20°$ have
been digitized with the COSMOS microdensitometer (MacGillivray &
Stobie 1984). The automatic algorithm for star/galaxy classification
implemented in COSMOS has been optimized so as to achieve a
completeness greater than 95% and a stellar contamination of less
than 12% for magnitudes $b_J < 20.0$. Magnitudes have been
calibrated via CCD sequences, providing a plate-to-plate accuracy of
$\Delta b_J \simeq 0.1$ and an rms plate zero-point offset of 0.05
magnitudes.

The EDSGC catalogue becomes rapidly incomplete at magnitudes fainter
than $b_J \sim 20.5$, thus we decided to consider only those
galaxies with $b_J < 20.0$. With this conservative choice, the
properties of the final identification sample (in terms of optical
completeness and contamination) are fully consistent with the global
ones of the EDSGC.



6.2. Positional uncertainties and search radius 

The search radius for optical identifications must be carefully
chosen, as it affects the completeness and reliability of the
resulting radiogalaxy sample (see Appendix B). The optimal search
radius is usually chosen on the basis of the total positional
uncertainties, i.e. the combination of the radio error (including
the uncertainty introduced by the fit), which depends on source
flux, and the accuracy of the optical positions.

To avoid any kind of assumption on this term, we empirically
determined the radio-optical positional accuracies from the
distribution of the measured radio-optical offsets. Given the
uncertainty in the classification of "wide" double radio sources, we
made this analysis only for pointlike and "close" double sources.

We first identified the 13 340 pointlike radio sources in the
catalogue with EDSGC galaxies brighter than magnitude $b_J = 20.0$,
looking for the nearest object inside a large square region of size
40", centred on each radio position. The distributions of the
observed radio-optical offsets in $\alpha$ and $\delta$ have been
analyzed in different bins of radio flux, selected in order to
contain approximately the same number of counterparts.

Table 1. Gaussian estimates of the total errors on coordinates,
$\sigma_\alpha$ and $\sigma_\delta$, obtained from a fit to the
$\Delta\alpha$ and $\Delta\delta$ distributions of optical
identifications for catalogue pointlike radio sources. For each of
the 5 flux intervals considered we also give the mean number of
contaminants per distance bin, $C_{\rm med}$/bin, estimated by means
of the 4 control samples, which represents the "pedestal function"
over which the Gaussian distribution of the true identifications
lies.

$S_{\rm p}$ (mJy beam$^{-1}$)    $C_{\rm med}$/bin    $\sigma_\alpha$ (")    $\sigma_\delta$ (")
$S_{\rm p} < 3.5$                3                    5.17                   5.28
$3.5 < S_{\rm p} < 4.6$          2                    4.11                   5.76
$4.6 < S_{\rm p} < 7.5$          3                    3.16                   3.33
$7.5 < S_{\rm p} < 15.0$         3                    2.04                   2.27
$S_{\rm p} > 15.0$               3                    2.21                   1.88

Each offset distribution is the sum of two distinct dis- 
tributions: a flat one due to the uniform distribution of 
spurious counterparts, plus a Gaussian one due to the 
true radio-optical associations. To estimate the rms of 
this Gaussian, which is the desired positional uncertainty, 
we first obtained an accurate measure of the mean level 
of contaminants by making optical identifications of ran- 
domly generated samples. We built 4 control samples, each 
containing 13 340 random positions, and looked for spuri- 
ous optical counterparts in the same way as for catalogue 
sources. 

By fitting the offset distributions with a Gaussian function plus a
constant pedestal, given by the contamination level in that flux
range obtained from the control samples, we evaluated the total
positional uncertainties shown in Table 1.

Similar values have been obtained repeating this anal- 
ysis for the 1530 "close" double radio sources, i.e. looking 
for optical counterparts inside a box of size 40" centered 
on the radio barycentre, and using again random samples 
to evaluate the contamination level. 

On the basis of the estimated positional uncertain- 
ties, for the optical identification procedure of NVSS radio 
sources we thus adopted a search radius of 15". 
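The Gaussian-plus-pedestal estimate described above can be illustrated with a minimal sketch. It is not the fitting code used for Table 1: it assumes a histogram centred on zero offset, holds the pedestal fixed at a known value (as obtained from control samples), and replaces a general minimizer with a simple grid search over trial widths. The function names, the bin layout and the synthetic numbers are all hypothetical.

```python
import math

def gaussian(x, sigma):
    """Zero-centred Gaussian profile with unit peak."""
    return math.exp(-0.5 * (x / sigma) ** 2)

def fit_sigma_with_pedestal(centres, counts, pedestal, sigmas):
    """Fit counts ~ pedestal + A * gaussian(x, sigma), pedestal held fixed.

    For each trial sigma the best amplitude A follows from linear least
    squares; the (sigma, A) pair with the smallest residual sum of
    squares is returned.
    """
    excess = [n - pedestal for n in counts]
    best = None
    for sigma in sigmas:
        g = [gaussian(x, sigma) for x in centres]
        amp = sum(e * gi for e, gi in zip(excess, g)) / sum(gi * gi for gi in g)
        rss = sum((e - amp * gi) ** 2 for e, gi in zip(excess, g))
        if best is None or rss < best[0]:
            best = (rss, sigma, amp)
    return best[1], best[2]

# Synthetic, noise-free offset histogram (arcsec bins); values are illustrative.
centres = [-19.0 + 2.0 * i for i in range(20)]
true_sigma, true_amp, pedestal = 3.0, 50.0, 3.0
counts = [pedestal + true_amp * gaussian(x, true_sigma) for x in centres]

sigmas = [0.5 + i / 10 for i in range(96)]  # trial widths from 0.5" to 10.0"
sigma_fit, amp_fit = fit_sigma_with_pedestal(centres, counts, pedestal, sigmas)
print(sigma_fit, amp_fit)
```

With real histograms the pedestal would be the mean number of contaminants per bin measured on the random samples, and the recovered width is only as fine as the chosen grid.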

The Likelihood Ratio method has subsequently been applied to the
list of counterparts of pointlike sources, to discard those cases
that are statistically unlikely to be true radio-optical
associations. The same method proved to be inapplicable to the
counterparts of "close" doubles, as in this case the hypothesis of
Gaussian-distributed positional uncertainties, required by the
Likelihood Ratio method, is not satisfied. As can be seen from
Fig. 5, in fact, an excess of true identifications is found in the
"tails" of the offset distributions. These could correspond to
identifications of distorted radio sources, such as head-tails,
whose morphology is not completely resolved at the low NVSS
resolution.



7. The radiogalaxy sample 

The list of NVSS pointlike radio sources identified with EDSGC
galaxies brighter than $b_J = 20.0$ inside a circle of radius 15"
consists of 1061 candidate counterparts, with an average of 254
contaminants from the control samples: the initial contamination is
thus 24% ± 2%. To this list of optical counterparts and to the
counterparts found in the control samples we applied the modified
Likelihood Ratio method described in Appendix B, evaluating $LR$ for
each source using the positional uncertainty relative to its flux
(see Table 1).

The Likelihood Ratio cutoff value for rejecting a counterpart as
unlikely to be true was found to be $LR = LR_* = 1.9$. Out of the
initial list of 1061 candidates, the final sample of optical
identifications of pointlike NVSS radio sources thus consists of 926
counterparts satisfying the condition $LR > 1.9$, while the number
of contaminants in this sample, given by the mean number of spurious
identifications in the control samples which have $LR > 1.9$, is
$C_{\rm med}(LR > 1.9) \simeq 145$.

The contamination percentage in the final sample of 926 optical
counterparts of NVSS pointlike radio sources is thus ≈ 16% ± 1%,
while due to the choice of the cutoff value for $LR$ we expect to
lose ≈ 24 true identifications. This corresponds to a completeness
of ~ 97% ± 1% and to a reliability of ~ 84% ± 1%: we conclude that,
with respect to the initial list of 1061 candidate counterparts, the
use of the modified Likelihood Ratio has appreciably lowered the
contamination level without discarding a large number of real
radio-optical associations. The identification percentage, expressed
as the ratio between the number of true identifications and the
total number of sources for which we looked for an optical
counterpart, is about
$\theta_{\rm pointlike} = (926 - 145)/13340 = 6\% \pm 0.2\%$.

For the 1530 "close" doubles we looked for an optical counterpart
brighter than $b_J = 20.0$ at a distance < 15" from the radio
barycentre, finding 169 identifications. The number of spurious
identifications obtained from the control samples is 28 ± 5: the
contamination percentage in the sample of optically identified
"close" double radio sources is then 16% ± 3%, while the
identification percentage is $\theta = 9\% \pm 1\%$, consistent at
the 3$\sigma$ level with the value found for pointlike sources.

The identification of the 1132 "wide" double radio sources proceeded
as follows: we first looked for an optical counterpart within a
radius of 15" of both the barycentre position and the positions of
the two components, and then inspected those cases where more than
one identification is found for the same radio source.

We initially identified 232 positions; in 156 cases we identify
either the barycentre or one (or both) of the components: in such
cases we consider the identification valid, even though this does
not guarantee that we are keeping the true optical counterpart.

In the remaining cases a puzzling situation emerges, as we find a
counterpart for both the barycentre and one component, or even for
the barycentre and the two components, so that a decision on the
most reliable identification is difficult to make. To discard or
retain an identification, we decided to proceed as follows: when we
identify the barycentre and one component with the same galaxy (25
cases), we assume that this happens because in the extraction
algorithm we allowed high flux ratios. In fact, when a common
identification is found for the barycentre and for one of the
components (normally the strongest), we consider the association
with the barycentre valid, even if this is somewhat arbitrary. Large
values of $S_1/S_2$, up to 10, are a feature introduced by our
extraction algorithm and are not representative of the true
distribution of flux ratios for double radio sources (see discussion
in Sect. 3.4). To perform an analysis of flux ratios and arm-length
ratios, and to compare it with other samples, we would need radio
maps with much better resolution.

Fig. 6. Optical identifications of double radio sources in the
interval 50" < D < 100": images are taken from the Digitized Sky
Survey and contours represent radio emission in the NVSS. From left
to right and from top to bottom: B00025, B01640, B01094, B02230,
B02322, B02348, B00532, B00065, for which we identify the barycentre
and one component with two different galaxies; B00258, B00698,
B01631, for which we identify the barycentre and one component with
the same galaxy and the second component with a different galaxy;
and finally B01471, for which the barycentre and the two components
are identified with 3 different galaxies (see text).



In the 12 cases when we identify the barycentre and one component
with two different galaxies, or the barycentre and one component
with the same galaxy but at the same time we find a different
identification for the second component, or finally we identify
separately both the barycentre and the two components, we decided
which is the most likely identification by visually inspecting the
field. These few cases are shown in Fig. 6.

When the radio source structure is similar to a "head-tail"
morphology, we retained the counterpart associated with the
component corresponding to the radio source "head". In the presence
of extended but symmetric radio morphologies we retain the
identification at the barycentre. An example of a spurious double is
B01471 in Fig. 6, the only case where we identify at the same time
the barycentre and the two components with different galaxies. In
this case we considered valid the two counterparts associated with
the components of the double source.
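The triage of "wide" doubles described above can be summarised in a short sketch. It only encodes the automatic part of the procedure (which cases are accepted directly, which keep the barycentre, and which are deferred to visual inspection); the final choice in the inspected cases remains a human judgement. The function name, the labels and the use of galaxy identifiers such as "g1" are hypothetical.

```python
def wide_double_decision(barycentre, comp1, comp2):
    """Triage of a "wide" double from its optical matches.

    Each argument is the identifier of the galaxy matched within the
    search radius at that position, or None if no galaxy was found.
    """
    comps = [g for g in (comp1, comp2) if g is not None]
    if barycentre is None and not comps:
        return "unidentified"
    # Only the barycentre, or only one/both components, are matched:
    # the identification is accepted as it is.
    if barycentre is None or not comps:
        return "accept"
    # Barycentre and a single component share the same galaxy:
    # the association with the barycentre is retained.
    if len(comps) == 1 and comps[0] == barycentre:
        return "keep barycentre"
    # Any other combination is deferred to visual inspection of the field.
    return "inspect visually"

print(wide_double_decision("g1", "g1", None))
print(wide_double_decision("g1", "g2", "g3"))
```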

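By applying these criteria, we obtained a list of 193 optical
counterparts of "wide" double radio sources. Because of the presence
of an "intrinsic" radio contamination, and given the subjective
method adopted in the selection of true counterparts, it is possible
to give only a lower limit to the contamination present in this list
of identifications. With $A$ the area of the search circle, $N$ the
number of searched positions (barycentre and two components for each
source) and $\rho_{\rm opt}$ the optical surface density of
galaxies, the number of spurious identifications is

$$ A \times N \times \rho_{\rm opt} = \pi \times (15'')^2 \times (1132 \times 3) \times 2.7 \times 10^{-5}\ {\rm arcsec}^{-2} \simeq 65, $$

that is a contamination percentage of ~ 28%.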
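The chance-coincidence estimate above is simple enough to check numerically. The sketch below assumes a galaxy surface density of 2.7e-5 per square arcsecond, the value that reproduces the quoted 65 spurious matches. It also assumes that the ~ 28% refers to the 232 positions initially identified, since 65/232 gives 28% while 65/193 would not; that reading is our inference, not a statement in the text. The function name is hypothetical.

```python
import math

def expected_chance_matches(radius_arcsec, n_positions, density_per_arcsec2):
    """Expected number of unrelated galaxies falling inside the search circles."""
    area = math.pi * radius_arcsec ** 2          # area of one search circle
    return area * n_positions * density_per_arcsec2

# 1132 "wide" doubles, each searched at the barycentre and at both components.
n_spurious = expected_chance_matches(15.0, 1132 * 3, 2.7e-5)
fraction = n_spurious / 232                      # assumed denominator, see above
print(round(n_spurious), round(100 * fraction))
```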

A summary of the different contamination levels, as well as the
number of optical counterparts found for each radio morphology we
are considering, is given in Table 2.

The final radiogalaxy sample thus consists of 1288 sources optically
identified with galaxies brighter than $b_J = 20$ in the EDSGC: in
Fig. 7 the magnitude and flux distributions for the radiogalaxy
sample are shown. The overall contamination in the radiogalaxy
sample is ~ 18%. The identification percentage we find for NVSS
sources is in agreement with that found by Magliocchetti & Maddox
(2001) for optical identifications of higher-resolution FIRST radio
sources with APM galaxies in the equatorial region, once scaled to
our radio flux and optical magnitude limits. In Paper II we will use
a subset of our radiogalaxy sample, characterized by a higher
reliability, to look for intermediate-redshift cluster candidates
associated with NVSS radio sources.

8. Summary 

The aim of this work is to build a sample of radio-optically
selected clusters of galaxies at intermediate redshift, in order to
study the evolution and general properties of groups and clusters as
well as the effect of the environment on the radio emission
phenomenon. In this paper we have discussed the compilation of a
radio source catalogue from 31 NVSS radio maps covering the South
Galactic Pole region, and the search for optical counterparts of
these radio sources. The main reason to build a radio source
catalogue alternative to the publicly available NVSS-NRAO one has
been the need to classify radio sources according to their
morphology - unresolved or double - so as to properly search for
their optical counterparts.

Our radio source catalogue has been built by detecting emission
peaks above the detection threshold $S_{\rm p} = 2.5$ mJy
beam$^{-1}$ and fitting Gaussian components with FWHM equal to the
NVSS beam size to the selected peaks. The source detection algorithm
first attempts a one-component fit to each peak and, depending on
the root mean square of the fit and on the distance between two
neighbouring peaks, a two-component fit is performed if necessary.

Table 2. The radiogalaxy sample: for each radio morphology class the
number of optical identifications and the contamination level are
shown. The last row gives these quantities for the whole sample.

Radio Morphology     N       Contamination
Pointlike            926     16% ± 1%
"Close" double       169     16% ± 3%
"Wide" double        193     28%
Total                1288    18%

Classification of double radio sources has been done by first
allowing the separation between components to be as large as 2.5'
and compiling a first list of "tentative" double sources. Then,
given the low resolution of the NVSS, a detailed analysis to
discriminate between pointlike and double sources has been done by
studying the probability of classifying two single, non-interacting
components as a double system on the basis of their separation. From
this analysis we found that the probability of two sources being a
physically bound system is negligible when their distance is greater
than 100". These doubles have been removed from the "tentative" list
and included as single components among the unresolved sources
while, conversely, the classification of double sources is correct
for those systems having D < 50" ("close" doubles). In the
intermediate range 50" < D < 100" the numbers of expected spurious
and true double sources are comparable. These cases ("wide" doubles)
have been included among double sources, but for them a more careful
optical identification procedure has been performed.
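The separation thresholds summarised above translate into a very small classifier, sketched here for illustration. The text does not say how pairs with D exactly equal to 50" or 100" are treated, so assigning both boundary values to the "wide" class is an assumption of this sketch, as are the function and label names.

```python
def classify_separation(d_arcsec, close_max=50.0, wide_max=100.0):
    """Classify a tentative double from the separation D of its components."""
    if d_arcsec < close_max:
        return "close double"        # classification as a double taken as correct
    if d_arcsec <= wide_max:
        return "wide double"         # spurious and true doubles are comparable
    return "two single sources"      # physical association negligible

for d in (20.0, 75.0, 130.0):
    print(d, classify_separation(d))
```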

The final radio source catalogue consists of 13 340 single and 2662
double radio sources over ≈ 550 sq. degrees of sky, and is complete
down to $S_{\rm p} = 2.5$ mJy beam$^{-1}$.

A quantitative test to assess the accuracy of the radio source
extraction algorithm has been made by comparing fluxes and positions
of a set of radio sources in our catalogue with the corresponding
values in the NVSS-NRAO catalogue. Since the NVSS-NRAO catalogue
does not classify double radio sources, this analysis has been
possible for pointlike sources only. We found that our results are
in agreement with those in the NVSS-NRAO catalogue, well inside the
predicted errors for the NVSS (Condon et al. 1998). As for double
radio sources, we made a qualitative analysis by visually inspecting
a set of "close" and "wide" doubles and looking at their
characteristics in the NVSS-NRAO catalogue. We found good agreement
in flux and positions for "close" doubles, while in most cases
"wide" ones clearly show a classical double radio morphology on the
maps. In such cases, where the optical counterpart should be
searched for near the radio barycentre, the use of the NVSS-NRAO
catalogue without a re-processing to detect double sources would
result in a loss of optical identifications and thus in a less
complete radiogalaxy sample.

Optical identifications of radio sources in our catalogue have been
made with EDSGC galaxies (Nichol et al. 2000) down to a limiting
magnitude of $b_J = 20.0$ and adopting a search radius of 15".

Fig. 7. Peak flux (a) and magnitude (b) distributions for the 1288
NVSS radio sources optically identified with EDSGC galaxies. There
are 67 radiogalaxies brighter than 80 mJy beam$^{-1}$ not shown in
the flux histogram.



Different strategies have been applied to the search for optical
counterparts of pointlike and double radio sources. For the latter,
the probability of having classified as a double system two
physically unrelated sources on the basis of their superposition on
the sky in fact depends on the distance between the two components.
The optical identification of the 13 340 pointlike radio sources led
to a sample of 926 radiogalaxies. The statistical completeness and
reliability of this sample have been evaluated by means of a
modified version of the Likelihood Ratio method proposed by De
Ruiter et al. (1977) (see Appendix B), to properly take into account
the true optical surface distribution of galaxies in the sky. This
sample is complete to 97% ± 1% and reliable to 84% ± 1%, with an
identification percentage of 6% ± 0.2%.

The optical identification of 1530 "close" double radio sources
(distance between components D < 50") has been made by looking for a
counterpart near the barycentre position. For these sources, the
probability of being a spurious double is low, < 13%. We optically
identified 169 barycentres of "close" doubles; in this case it was
not possible to apply the modified Likelihood Ratio method to
evaluate the reliability and completeness of the sample. An estimate
of the contamination level has been computed as the probability of
chance radio-optical superposition on the basis of the average
observed optical surface galaxy density. We found a contamination of
16% ± 3% for optical identifications of "close" double radio
sources.

Optical identifications of "wide" doubles (distance between
components 50" < D < 100") are made difficult by the high percentage
of expected radio misclassifications: the number of true radio
associations is in fact comparable with the number of radio
contaminants. We thus looked for optical counterparts both near the
radio barycentre and near the positions of the radio components,
visually inspecting those cases where more than one optical
identification is found for the same radio source. We obtained a
list of 193 optical counterparts of "wide" double radio sources,
with a contamination of the order of 28%: this contamination level
must be seen as a lower limit, as it does not take into account the
joint probability of having a spurious optical identification near
the barycentre of a spurious double radio source.

The final sample thus lists 1288 radiogalaxies and 
represents a valuable opportunity for the study of the 
multi-wavelength properties of the radiogalaxy popula- 
tions down to a low flux level. 

This sample has been used to look for galaxy clus- 
ters associated to NVSS radiogalaxies: in a following pa- 
per (Zanichelli et al. 2001) we discuss the cluster selection 
strategy and the first observational results, that prove this 
technique to be a powerful tool for the selection of galaxy 
groups and clusters at intermediate redshift.

Appendix A: the Gaussian Fitting Algorithm

To define fluxes and accurate positions of radio sources from NVSS
maps, we developed a code which performs a two-dimensional Gaussian
fit by means of a minimization process. Starting from $M$ functions
in $N$ variables, $f_k(x_1, x_2, \dots, x_N)$, the routine MINSQ
(Pomentale 1968) minimizes the sum:

$$ \phi^2(x_1, x_2, \dots, x_N) = \sum_{k=1}^{M} f_k^2(x_1, x_2, \dots, x_N) \tag{A.1} $$

where $M > N > 2$. The minimization process is iterated until the
difference between the function before and after the minimization is
lower than a user-selected value ("stopping rule"), or until a
pre-defined maximum number of iterations is reached. Each source to
be fitted is represented by a circular Gaussian of width $\sigma$
(fixed by the NVSS beam FWHM), peak amplitude $A$ and centre
$(x_0, y_0)$:

$$ G(x, y) = A \exp\left[-\frac{(x - x_0)^2 + (y - y_0)^2}{2\sigma^2}\right] \tag{A.2} $$



If the source image is composed of $M$ independent measurements of
the amplitude $a_k$, each one with a known associated error
$\sigma_k$, the $f_k$ can be defined as:

$$ f_k = \frac{a_k - G(x_k, y_k)}{\sigma_k} \tag{A.3} $$

Inserting this expression for the $f_k$ in (A.1), the
maximum-likelihood fit would be the one which minimizes the
$\chi^2$. In our case, the errors on the individual measurements are
not known a priori: as a first approximation we could assume that
they are constant over the image and equal to the mean survey rms
(≈ 0.45 mJy beam$^{-1}$), but this assumption fails in the presence
of bright sources. We thus built the sum simply from the unweighted
quadratic differences between the data and the fit at each pixel:

$$ \phi^2 = \sum_{k=1}^{M} f_k^2 = \sum_{k=1}^{M} \left(a_k - G(x_k, y_k)\right)^2 \tag{A.4} $$
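To make the unweighted minimization of Eq. (A.4) concrete, the sketch below fits a circular Gaussian of fixed width to a small synthetic cutout. It does not reproduce MINSQ: the minimizer is replaced by a brute-force grid search over trial centres, with the best amplitude at each centre obtained analytically from linear least squares. The cutout size, the width of 1.3 pixels, the grid step and all names are hypothetical.

```python
import math

def gaussian(x, y, amp, x0, y0, sigma):
    """Circular Gaussian with peak amplitude amp centred at (x0, y0)."""
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return amp * math.exp(-r2 / (2.0 * sigma ** 2))

def phi2(pixels, amp, x0, y0, sigma):
    """Unweighted sum of squared residuals between data and model."""
    return sum((a - gaussian(x, y, amp, x0, y0, sigma)) ** 2 for x, y, a in pixels)

def fit_fixed_width(pixels, sigma, centres):
    """Grid search over trial centres at fixed width.

    For each centre the amplitude minimizing phi2 is the linear
    least-squares solution sum(a*g) / sum(g*g).
    Returns (phi2_min, amplitude, x0, y0).
    """
    best = None
    for x0, y0 in centres:
        g = [gaussian(x, y, 1.0, x0, y0, sigma) for x, y, _ in pixels]
        amp = sum(a * gi for (_, _, a), gi in zip(pixels, g)) / sum(gi * gi for gi in g)
        value = phi2(pixels, amp, x0, y0, sigma)
        if best is None or value < best[0]:
            best = (value, amp, x0, y0)
    return best

# Synthetic, noise-free 9x9 cutout with a source at (4.25, 3.75) pixels.
sigma = 1.3
pixels = [(x, y, gaussian(x, y, 12.0, 4.25, 3.75, sigma))
          for x in range(9) for y in range(9)]
centres = [(0.25 * i, 0.25 * j) for i in range(33) for j in range(33)]
phi2_min, amp_fit, x_fit, y_fit = fit_fixed_width(pixels, sigma, centres)
print(amp_fit, x_fit, y_fit)
```

Fixing the width leaves only the amplitude and the two centre coordinates free, which is why the amplitude can be eliminated analytically at each trial position.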



The value of $\phi^2$ obtained from the minimization procedure and
normalized to the number of functions $M$ is the estimated error
associated to the fit procedure, $FF$. This uncertainty can be
expressed as the sum in quadrature of a constant term $\varepsilon$,
dependent on the map noise, plus a term proportional to the source
flux through an a priori unknown constant $c$:

$$ FF = \sqrt{\varepsilon^2 + (c \times S_{\rm p})^2} \tag{A.5} $$



Thus, $FF$ is not a good indicator of fit reliability, due to its
dependence on source flux. To correct for this dependence, we
determined $c$ as follows: first, we evaluated $\varepsilon$ by
analysing the distribution of $FF$ for faint sources, for which the
flux term in (A.5) is negligible and the median value of the
distribution of $FF$ is a good approximation for $\varepsilon$.
Second, introducing this value of $\varepsilon$ in (A.5) and
considering bright sources, the value for the constant $c$ can be
determined.

The fit uncertainty associated to each source can thus be written
as:

$$ E = \sqrt{FF^2 - (c \times S_{\rm p})^2} \tag{A.6} $$

and is evaluated both for the one-component and for the
two-component fit.
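The two-step calibration of the noise and flux terms, and the flux-corrected fit uncertainty of Eq. (A.6), can be sketched as follows. The text does not specify how $c$ is extracted from the bright sources; taking the median of the per-source values is an assumption of this sketch, as are the flux limits separating "faint" and "bright" sources, the function names and the synthetic input values.

```python
import math
from statistics import median

def calibrate(sources, faint_limit, bright_limit):
    """Estimate the noise term eps and the flux coefficient c.

    sources is a list of (peak_flux, FF) pairs. eps is the median FF of
    faint sources; c is the median of sqrt(FF^2 - eps^2) / flux over
    bright sources (the use of a median for c is an assumption).
    """
    eps = median(ff for s, ff in sources if s < faint_limit)
    c = median(math.sqrt(max(ff ** 2 - eps ** 2, 0.0)) / s
               for s, ff in sources if s > bright_limit)
    return eps, c

def fit_uncertainty(ff, s_peak, c):
    """Flux-corrected fit uncertainty, sqrt(FF^2 - (c * S_p)^2)."""
    return math.sqrt(max(ff ** 2 - (c * s_peak) ** 2, 0.0))

# Synthetic sources following FF = sqrt(eps^2 + (c * S_p)^2); values are illustrative.
eps_true, c_true = 0.45, 0.02
fluxes = [2.5, 3.0, 3.5, 4.0, 50.0, 100.0, 200.0, 400.0]
sources = [(s, math.hypot(eps_true, c_true * s)) for s in fluxes]

eps, c = calibrate(sources, faint_limit=5.0, bright_limit=15.0)
e_100 = fit_uncertainty(sources[5][1], 100.0, c)
print(eps, c, e_100)
```

Because the faint sources still carry a small flux term, the recovered noise term is slightly biased high in this toy example, which in turn makes the corrected uncertainty of very bright sources sensitive to small errors in $c$.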

Starting from source pixel coordinates, Right Ascension and
Declination have been computed by means of the conversion formulae
for the sine projection used in the NVSS:

$$ \alpha = \alpha_0 + \arctan\left[\frac{x}{\cos\delta_0 \sqrt{1 - x^2 - y^2} - y \sin\delta_0}\right] \tag{A.7} $$

$$ \delta = \arcsin\left(y \cos\delta_0 + \sin\delta_0 \sqrt{1 - x^2 - y^2}\right) \tag{A.8} $$

where $(\alpha_0, \delta_0)$ are the central Right Ascension and
Declination of the map, and $(\alpha, \delta)$ are those of a source
with known pixel coordinates $(x, y)$.

Appendix B: The modified likelihood ratio - using control samples

The sample resulting from an optical identification program is
characterized by a contamination level, which depends on the number
of spurious identifications, and a completeness level, which is the
percentage of true radio-optical associations we were able to
correctly identify on the basis of the chosen search radius.

We have a "correct" identification when the combined radio and
optical positional uncertainties are such that the true counterpart
of the radio source, if it exists, does not lie outside the area
defined by the search radius and, at the same time, the first
(nearest) contaminant is not closer to the radio source than the
identification itself.

In the case when a correct identification does not exist (empty
field), we will misidentify as true a contaminant each time a galaxy
is found inside the search region. The percentage of identification
is defined as the fraction of correct identifications with respect
to the total number of radio sources for which an optical
counterpart has been looked for.

The completeness of an optical identification program represents the
fraction of correct identifications among the radio sources having
an optical counterpart, while the reliability is defined as the
fraction of counterparts that are true radio-optical associations,
i.e. it is the complement to the contamination level in the sample.

Under the hypothesis that the positions of a radio source and its
optical counterpart are intrinsically coincident, it is possible to
define the a priori probability $p(r \mid id)$ that the
radio-optical offset is found in the distance interval
$(r, r + dr)$ due to the positional uncertainties. Similarly, under
the hypothesis that the counterpart is a contaminant, it is possible
to define the a priori probability $p(r \mid c)$ that the
contaminant is found inside $(r, r + dr)$.

For each radio source it is then possible to define the Likelihood
Ratio $LR$ as the ratio between these two probabilities: an optical
counterpart is considered as the true radio-optical association if
$p(r \mid id)$ is greater than $p(r \mid c)$ by a factor $LR_*$ to
be determined.

Nevertheless, what is actually computable from an identification
program are the a posteriori probabilities $p(id \mid r)$ and
$p(c \mid r)$ that, having found a counterpart at a given distance
$r$ from the radio source, we are dealing with the true
identification or with a contaminant.

The Likelihood Ratio method (De Ruiter et al. 1977) makes use of the
Bayes theorem to express $p(id \mid r)$ and $p(c \mid r)$ in terms
of $LR$, that is by means of the corresponding a priori
probabilities $p(r \mid id)$ and $p(r \mid c)$:

$$ p(id \mid r) = p(id) \times p(r \mid id) / p(r) \tag{B.1} $$

$$ p(c \mid r) = p(c) \times p(r \mid c) / p(r) \tag{B.2} $$



where $p(r)$ is the probability of finding an object (irrespective
of whether it is a contaminant or the true identification) at a
distance between $r$ and $r + dr$ from the radio source; $p(id)$ is
the a priori probability of finding the optical counterpart of a
radio source and $p(c) = 1 - p(id)$ the probability of finding a
spurious identification.

By applying the Bayes theorem and under the assumption that the true
identification is always the nearest object to the radio source,
$p(id \mid r)$ and $p(c \mid r)$ can be written as:

$$ p(id \mid r) = \frac{\vartheta\, LR(r)}{\vartheta\, LR(r) + 1} \tag{B.3} $$

$$ p(c \mid r) = \frac{1}{\vartheta\, LR(r) + 1} \tag{B.4} $$



where $\vartheta = \theta/(1 - \theta)$ and $\theta$ is the a priori
unknown percentage of expected true identifications. The latter can
be estimated as the sum of the probabilities for each individual
identification to be real, normalized to the total number of
counterparts found. The quantities $\theta$ and $p(id \mid r)$ are
not independent and the solution for $\theta$ is found iteratively.
The total number of expected true identifications, $N_{\rm id}$, is
given by $N_{\rm id} = \theta N_{\rm tot}$, where $N_{\rm tot}$ is
the total number of radio sources for which an optical counterpart
is searched. Once $\theta$ is determined, the reliability and
completeness of the final identification sample can be defined as a
function of the cutoff value $LR_*$ (denoted $L$ in the equations):

$$ C = 1 - \sum_{LR_i < L} p_i(id \mid r) / N_{\rm id} \tag{B.5} $$

$$ R = 1 - \sum_{LR_i > L} p_i(c \mid r) / N(LR > L) \tag{B.6} $$



where $N(LR > L)$ is the total number of identifications having
$LR > L$. The value for $LR_*$ is determined by studying the
behaviour of $C$ and $R$ as a function of $LR$, finding the value of
$LR$ that maximizes $(C + R)/2$.

In general, the value of $LR_*$ is close to 2.0, which means
considering as true all those identifications for which the a priori
probability of having correctly identified the radio source is twice
the a priori probability of having a contaminant.
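The iterative determination of the fraction of true identifications, and the resulting completeness and reliability, can be sketched as below. The sketch takes the likelihood ratios as given inputs (how $LR$ is computed from the offsets is not reproduced here) and assumes that every searched source contributes one nearest candidate, so that the number of counterparts and the number of searched sources coincide. The fixed-point scheme, the starting value, the tolerance, the function names and the input values are all hypothetical choices.

```python
def posteriors(lr, theta):
    """A posteriori probabilities (true identification, contaminant) for one LR."""
    t = theta / (1.0 - theta)
    p_id = t * lr / (t * lr + 1.0)
    return p_id, 1.0 - p_id

def solve_theta(lr_values, theta0=0.5, tol=1e-10, max_iter=1000):
    """Fixed-point iteration: theta equals the mean posterior of being real."""
    theta = theta0
    for _ in range(max_iter):
        new = sum(posteriors(lr, theta)[0] for lr in lr_values) / len(lr_values)
        if abs(new - theta) < tol:
            return new
        theta = new
    return theta

def completeness_reliability(lr_values, theta, cutoff):
    """Completeness and reliability for a given cutoff on LR."""
    n_id = theta * len(lr_values)                 # expected true identifications
    lost = sum(posteriors(lr, theta)[0] for lr in lr_values if lr < cutoff)
    kept = [lr for lr in lr_values if lr > cutoff]
    spurious = sum(posteriors(lr, theta)[1] for lr in kept)
    return 1.0 - lost / n_id, 1.0 - spurious / len(kept)

# Illustrative likelihood ratios for eight candidate counterparts.
lr_values = [20.0, 15.0, 8.0, 3.0, 0.5, 0.2, 0.1, 0.05]
theta = solve_theta(lr_values)
C, R = completeness_reliability(lr_values, theta, cutoff=2.0)
print(theta, C, R)
```

Scanning the cutoff over a range of values and keeping the one that maximizes the mean of the two returned quantities would give the optimal threshold in the sense described above.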
One critical factor in the Likelihood Ratio method proposed by De
Ruiter et al. (1977) is the assumption of a constant optical surface
density of galaxies. This does not allow one to take into account
the real galaxy clustering and thus can heavily affect the estimates
of $C$ and $R$. To avoid this limitation, we applied a modified
version of this method, which makes use of control samples to
properly evaluate the contamination level in the optical
identification samples.

Control samples of the same size as the radio source catalogue are
built by assigning to each entry a random position and, once the
radius of the search region has been defined, are optically
identified with galaxies as is done for the radiogalaxy sample. We
can write the expected number of contaminants in the final
identification sample as the average of the spurious identifications
found in each control sample, $C_{\rm med}$. The expected number of
true identifications will thus be given by the difference between
the total number of counterparts found, $N$, and the mean number of
contaminants: $N_{\rm id} = N - C_{\rm med}$. We can also obtain the
identification percentage $\theta = N_{\rm id}/N_{\rm tot}$, where
$N_{\rm tot}$ is the total number of radio sources for which we have
searched for an optical counterpart.

According to the Likelihood Ratio method, the completeness expresses
the fraction of real identifications for which $LR > L$, so we can
write:

$$ C = 1 - \left[N(LR < L) - C_{\rm med}(LR < L)\right] / N_{\rm id} \tag{B.7} $$

The term in square brackets is the number of true identifications
(i.e. excluding the contaminants) that are lost due to the choice of
the cutoff value $LR_*$.

Similarly, we can write for the reliability:

$$ R = 1 - C_{\rm med}(LR > L) / N(LR > L) \tag{B.8} $$

That is, $R$ is defined in terms of the fraction of contaminants
that are included in the sample due to the choice of the cutoff
value $LR_*$.
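As a numerical illustration of the control-sample estimates of completeness and reliability, the sketch below plugs in the rounded counts quoted for pointlike sources in Sect. 7 (1061 candidates, 254 contaminants on average, 926 candidates and 145 contaminants above the cutoff, 13 340 searched sources). With these rounded inputs the number of lost true identifications comes out as 26 rather than the ≈ 24 quoted in the text, a difference we have not traced but which may simply reflect rounding of the control-sample means; the percentages agree at the quoted precision. Function and variable names are hypothetical.

```python
def completeness(n_below, c_med_below, n_id):
    """Completeness: 1 minus the fraction of true identifications lost below the cutoff."""
    return 1.0 - (n_below - c_med_below) / n_id

def reliability(c_med_above, n_above):
    """Reliability: 1 minus the fraction of contaminants retained above the cutoff."""
    return 1.0 - c_med_above / n_above

n_found, c_med = 1061, 254           # all candidates and mean contaminants
n_above, c_med_above = 926, 145      # the same quantities above the cutoff
n_tot = 13340                        # searched pointlike radio sources

n_id = n_found - c_med               # expected true identifications
C = completeness(n_found - n_above, c_med - c_med_above, n_id)
R = reliability(c_med_above, n_above)
theta = n_id / n_tot                 # identification percentage
print(round(C, 2), round(R, 2), round(theta, 2))
```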